This data set contains transcribed audio data for Bengali. It consists of WAV
files and a TSV file. The file utt_spk_text.tsv lists, for each audio file, a
FileID, an anonymized UserID, and the transcription of the audio in that file.
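<p>
As a rough illustration of how utt_spk_text.tsv might be read, the sketch below
parses it in Python. It assumes that the columns are tab-separated and appear in
the order FileID, UserID, transcription, that there is no header row, and that
the file is UTF-8 encoded; none of this is stated above, so check it against the
actual file. The names Utterance and read_utterances are hypothetical helpers
introduced only for this example.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Utterance(NamedTuple):
    """One row of utt_spk_text.tsv (column order is an assumption)."""
    file_id: str
    user_id: str
    text: str


def read_utterances(lines: Iterable[str]) -> Iterator[Utterance]:
    """Yield one Utterance per well-formed, three-column TSV line."""
    # QUOTE_NONE keeps quotation marks inside transcriptions literal.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            # Skip blank or malformed lines instead of guessing their layout.
            continue
        yield Utterance(*row)


if __name__ == "__main__":
    # Assumes the TSV file sits in the current working directory.
    with open("utt_spk_text.tsv", encoding="utf-8", newline="") as handle:
        for utterance in read_utterances(handle):
            print(utterance.file_id, utterance.user_id, utterance.text)
```

Skipping malformed rows is a deliberate choice here, since the data set may
still contain errors; a stricter reader could raise an exception instead.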
<p>
The data set has been manually quality-checked, but it may still contain errors.
<p>
See the LICENSE.txt file for license information.
<p>
Copyright 2016, 2017, 2018 Google, Inc.
<p>
If you use this data in publications, please cite it as follows:
<pre>
  @inproceedings{kjartansson-etal-sltu2018,
    title = {{Crowd-Sourced Speech Corpora for Javanese, Sundanese, Sinhala, Nepali, and Bangladeshi Bengali}},
    author = {Oddur Kjartansson and Supheakmungkol Sarin and Knot Pipatsrisawat and Martin Jansche and Linne Ha},
    booktitle = {Proc. The 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU)},
    year  = {2018},
    address = {Gurugram, India},
    month = aug,
    pages = {52--55},
    URL   = {http://dx.doi.org/10.21437/SLTU.2018-11}
  }
</pre>
